Load Balancing
When NGINX acts as a reverse proxy, it can distribute incoming requests across multiple backend servers to achieve:
- High availability
- Better performance
- Scalability
- Fault tolerance
This is configured with an `upstream` block.
Basic Upstream Configuration
```nginx
upstream app_backend {
    server 10.0.0.11:8080;
    server 10.0.0.12:8080;
    server 10.0.0.13:8080;
}
```
By default, NGINX uses round-robin.
Round-Robin Load Balancing (Default)
Requests are distributed sequentially across backend servers:
Request 1 → Server A
Request 2 → Server B
Request 3 → Server C
Request 4 → Server A
Each request goes to the next server in order.
Example Configuration
```nginx
upstream app_backend {
    server 10.0.0.11:8080;
    server 10.0.0.12:8080;
}

server {
    listen 80;
    location / {
        proxy_pass http://app_backend;
    }
}
```
- First request → `10.0.0.11`
- Second request → `10.0.0.12`
- Third request → `10.0.0.11`
- Even distribution over time
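The rotation can be sketched in a few lines of Python (a toy model of the selection order, not NGINX's actual implementation; the backend addresses are taken from the example above):

```python
from itertools import cycle

# Backend list mirroring the example upstream block
backends = ["10.0.0.11:8080", "10.0.0.12:8080"]

# cycle() yields backends in order and wraps around, like round-robin
rr = cycle(backends)
picks = [next(rr) for _ in range(4)]
# picks: 10.0.0.11, 10.0.0.12, 10.0.0.11, 10.0.0.12
```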
Weighted Round-Robin
```nginx
upstream app_backend {
    server 10.0.0.11 weight=3;
    server 10.0.0.12 weight=1;
}
```
- Server `10.0.0.11` receives 75% of traffic
- Server `10.0.0.12` receives 25%
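The long-run traffic share follows directly from the weights. A simple way to see it is to expand each server by its weight (NGINX actually uses a smooth weighted round-robin that interleaves picks, so this expansion only models the resulting ratio, not the exact order):

```python
# Weights from the example upstream block above
weights = {"10.0.0.11": 3, "10.0.0.12": 1}

# One full cycle contains each server `weight` times
cycle_slots = [server for server, w in weights.items() for _ in range(w)]

# 3 of 4 slots belong to 10.0.0.11 → 75% of requests
share = cycle_slots.count("10.0.0.11") / len(cycle_slots)
print(f"10.0.0.11 handles {share:.0%} of requests")  # → 75%
```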
Pros
- Simple
- Efficient for similar backends
- Default behavior
Cons
- Does not consider current load
- Not ideal for long-running requests
Least Connections (least_conn)
Each new request is sent to the backend with the fewest active connections.
Server A → 10 active connections
Server B → 3 active connections
→ New request goes to Server B
Example Configuration
```nginx
upstream app_backend {
    least_conn;
    server 10.0.0.11:8080;
    server 10.0.0.12:8080;
}
```
- NGINX tracks active connections
- New requests go to least busy server
- Excellent for uneven or long-lived requests
Weighted Least Connections
```nginx
upstream app_backend {
    least_conn;
    server 10.0.0.11 weight=2;
    server 10.0.0.12 weight=1;
}
```
NGINX factors weight into decision-making.
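A reasonable mental model (a simplification of NGINX's internal comparison, using the weights and connection counts from the examples above) is to pick the server with the lowest active-connections-to-weight ratio:

```python
# Toy weighted least-connections pick
servers = {
    "10.0.0.11": {"weight": 2, "active": 10},
    "10.0.0.12": {"weight": 1, "active": 3},
}

def pick(servers):
    # Lower active/weight ratio = "least busy" relative to capacity
    return min(servers, key=lambda s: servers[s]["active"] / servers[s]["weight"])

print(pick(servers))  # 10/2 = 5.0 vs 3/1 = 3.0 → 10.0.0.12
```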
Pros
- Adapts to load
- Ideal for slow APIs or streaming
- Reduces overload
Cons
- Slightly more overhead
- Doesn’t track CPU or memory usage
IP Hash (ip_hash)
Client IP address is hashed to select a backend server.
Client IP → Hash → Server
Same client IP always maps to the same server (as long as it’s available).
```nginx
upstream app_backend {
    ip_hash;
    server 10.0.0.11:8080;
    server 10.0.0.12:8080;
}
```
- Client `203.0.113.10` → Server A
- Client `203.0.113.10` → Server A again
- Enables session persistence (sticky sessions)
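The scheme can be modeled in Python. Note that NGINX's `ip_hash` keys on the first three octets of an IPv4 address (so clients in the same /24 land on the same server); this sketch uses `md5` as a stand-in for NGINX's internal hash function:

```python
import hashlib

backends = ["10.0.0.11:8080", "10.0.0.12:8080"]

def pick(client_ip: str) -> str:
    # Hash only the first three octets, as ip_hash does for IPv4
    key = ".".join(client_ip.split(".")[:3])
    digest = int(hashlib.md5(key.encode()).hexdigest(), 16)
    return backends[digest % len(backends)]

# The same client always maps to the same backend
assert pick("203.0.113.10") == pick("203.0.113.10")
# Clients in the same /24 also share a backend
assert pick("203.0.113.10") == pick("203.0.113.99")
```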
Use Case
- Applications that store sessions in memory
- Legacy systems without shared session storage
Limitations
- Uneven distribution with NAT users
- Limited weight support (the `weight` parameter is honored only in NGINX 1.3.1+/1.2.2+)
- Scaling changes may remap clients
Pros
- Simple session persistence
- No cookies required
Cons
- Poor distribution with many clients behind NAT
- Scaling issues
Choosing the Right Method
| Scenario | Best Method |
|---|---|
| Identical backends | Round-robin |
| Long-running requests | Least connections |
| In-memory sessions | IP hash |
| Modern apps | `least_conn` + shared sessions |
Real-World Production Example
```nginx
upstream web_backend {
    least_conn;
    server 10.0.0.11:8080 max_fails=3 fail_timeout=30s;
    server 10.0.0.12:8080 max_fails=3 fail_timeout=30s;
}

server {
    listen 80;
    location / {
        proxy_pass http://web_backend;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}
```
- Least-loaded server gets traffic
- Failed servers temporarily removed
- Headers preserve client identity
Health & Failover Behavior
NGINX:
- Marks a server as failed after `max_fails` failed attempts
- Skips that server for the duration of `fail_timeout`
- Automatically retries it once the timeout expires, while routing to healthy servers
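Failover can be hardened further with a `backup` server, which receives traffic only when all primary servers are marked as failed. A sketch extending the production example above (the third server address is hypothetical):

```nginx
upstream web_backend {
    least_conn;
    server 10.0.0.11:8080 max_fails=3 fail_timeout=30s;
    server 10.0.0.12:8080 max_fails=3 fail_timeout=30s;
    # Used only when both primary servers are unavailable
    server 10.0.0.13:8080 backup;
}
```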